fix(sf): dry-run the ReportCard + Director advisory tail on the Friday preflight (ROADMAP L4504) #372
Merged
The Friday-PM Preflight Pipeline (shell_run=true) dry-executes the Saturday
SF to exercise bootstrap paths, but ReportCard + Director have no dry path
(their payload is only {date}; the Director Lambda gates solely on
DIRECTOR_ENABLED). If this is left ungated, then once DIRECTOR_ENABLED is
flipped on, a Friday preflight would run ReportCard for real over
backtest/{Fri-date}/*, which the dry workload never wrote (yielding a
degenerate, mostly-N/A card), and fire a real Opus Director call that merges
that plan into the SHARED, non-date-scoped carry-over ledger
(director/carryover_ledger.json), polluting the state the real Saturday run
reads. This is a correctness bug, not just wasted cost.
Adds a CheckShellRunSkipDirector Choice between WaitForWeeklySubstrateHealthCheck
and ReportCard: on shell_run=true it hard-skips BOTH advisory states straight to
the shell-run-aware notify gate; shell_run absent/false → Default → ReportCard,
byte-identical to the pre-guard real Saturday run. The preflight's purpose is
bootstrap exercise and both advisory Lambdas already have their own canaries.
Unblocks the Phase-F DIRECTOR_ENABLED=true flip. Tests updated +
test_shell_run_skips_advisory_tail added (497 SF-structure tests pass).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
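The hard-skip gate described above can be sketched as a Choice state plus a toy router. This is only an illustration under stated assumptions: the state names CheckShellRunSkipDirector and ReportCard come from the commit message, while "NotifyGate" is a hypothetical stand-in for the shell-run-aware notify gate (its real name is not given here), and the exact comparison rules in the real definition may differ.

```python
# Hypothetical fragment of the state machine definition, written as a Python
# dict. "NotifyGate" is a placeholder name, not taken from the real definition.
STATES = {
    "CheckShellRunSkipDirector": {
        "Type": "Choice",
        "Choices": [
            {
                # Match only when shell_run exists AND is literally true.
                "And": [
                    {"Variable": "$.shell_run", "IsPresent": True},
                    {"Variable": "$.shell_run", "BooleanEquals": True},
                ],
                "Next": "NotifyGate",
            }
        ],
        # shell_run absent or false falls through to the real advisory tail.
        "Default": "ReportCard",
    }
}


def next_state(choice_state: dict, state_input: dict) -> str:
    """Route a flat input dict through the single And-rule above.

    A toy evaluator for illustration, not a general ASL interpreter.
    """
    for rule in choice_state["Choices"]:
        matched = True
        for cond in rule["And"]:
            key = cond["Variable"].removeprefix("$.")
            if "IsPresent" in cond:
                matched &= (key in state_input) == cond["IsPresent"]
            elif "BooleanEquals" in cond:
                matched &= state_input.get(key) is cond["BooleanEquals"]
        if matched:
            return rule["Next"]
    return choice_state["Default"]
```

With this shape, an input of {"shell_run": True} routes past both advisory states, while an empty input or {"shell_run": False} takes the Default branch to ReportCard.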
Note for the record: this PR merged the hard-skip |
cipher813 added a commit that referenced this pull request on Jun 5, 2026
#373)

Follow-up to #372 (merged), which shipped the hard-skip CheckShellRunSkipDirector Choice gate. Per the chosen approach, convert it to keystone-consistent dry-execution so the Friday preflight still exercises the ReportCard + Director Lambda bootstrap/import/IAM/transport paths instead of skipping them (the shell-run keystone deliberately removed all skip-exceptions).

- Remove the CheckShellRunSkipDirector Choice; restore the direct WaitForWeeklySubstrateHealthCheck -> ReportCard edge.
- Thread dry_run.$=$.research_dry into the ReportCard + Director payloads (the canonical shell-run-dry signal; false on the real Saturday run / true on the preflight), mirroring the eval-judge / rationale-clustering / replay-concordance / counterfactual advisory Lambdas.

Handlers (companion alpha-engine-evaluator #23): ReportCard dry -> no-write (still boots/imports/reads/computes); Director dry -> no-Opus / no-write probe (constructs the real client to validate the langchain import + SSM key-fetch IAM grant, reads the ledger, stops short of .invoke(), mutates nothing; the shared non-date-scoped carry-over ledger is never polluted).

Tests updated to expect the dry-execution wiring + test_advisory_tail_runs_dry_on_preflight + payload-key registry. 497 SF-structure tests pass.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
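The Director dry probe described in this commit message could look roughly like the following. This is a minimal sketch, not the handler from alpha-engine-evaluator #23: the function name, the injected build_client / read_ledger / write_ledger callables, the return shape, and the way a plan is merged into the ledger are all hypothetical, chosen so the example runs without AWS or an LLM client. Only the ledger path and the "construct the client, read the ledger, stop short of .invoke(), write nothing" behaviour come from the text.

```python
from typing import Any, Callable, Dict

LEDGER_KEY = "director/carryover_ledger.json"  # path quoted in the PR text


def director_handler(
    event: Dict[str, Any],
    build_client: Callable[[], Any],               # hypothetical: builds the real LLM client
    read_ledger: Callable[[str], dict],            # hypothetical: loads the shared ledger
    write_ledger: Callable[[str, dict], None],     # hypothetical: persists the shared ledger
) -> Dict[str, Any]:
    """Run the Director, or only probe its bootstrap path when dry_run is set."""
    dry_run = bool(event.get("dry_run", False))

    # Both paths construct the client and read the ledger, so the preflight
    # still exercises the import, credential-fetch, and read permissions.
    client = build_client()
    ledger = read_ledger(LEDGER_KEY)

    if dry_run:
        # Stop short of .invoke() and never write: the shared ledger is untouched.
        return {"status": "dry_run_ok", "date": event.get("date"),
                "ledger_entries": len(ledger)}

    # Real run: call the model and merge its plan into the ledger.
    # (Keying the ledger by date is an assumption made for this sketch.)
    plan = client.invoke({"date": event["date"], "ledger": ledger})
    merged = {**ledger, event["date"]: plan}
    write_ledger(LEDGER_KEY, merged)
    return {"status": "ok", "date": event["date"], "ledger_entries": len(merged)}
```

Injecting the three callables keeps the dry and real paths identical up to the point of the model call, which is the property the preflight is meant to exercise.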
Closes the L4504 pre-flip blocker (config ROADMAP, #477 merged). SF half; companion handler PR: alpha-engine-evaluator #23.
Problem
The Friday preflight (shell_run=true) dry-executes the Saturday SF. ReportCard + Director were added after the keystone with no dry path (payload only {date}; the Director Lambda gates solely on DIRECTOR_ENABLED). Once DIRECTOR_ENABLED is flipped on, a preflight would run ReportCard for real over backtest/{Fri-date}/* the dry workload never wrote → degenerate card → fire a real Opus Director call that merges that plan into the SHARED, non-date-scoped carry-over ledger director/carryover_ledger.json, polluting the state the real Saturday run reads. Correctness bug, not just wasted cost.

Fix
Thread "dry_run.$": "$.research_dry" into both payloads. This is the canonical shell-run-dry signal (seeded false in InitializeInput, flipped true in ApplyShellRunDefaults), exactly how the eval-judge / rationale-clustering / replay-concordance / counterfactual advisory Lambdas already run dry. The handlers (evaluator #23) then:
- ReportCard dry: no-write (still boots/imports/reads/computes).
- Director dry: no-Opus / no-write probe; constructs the real client (validating the langchain-anthropic import + the SSM key-fetch IAM grant), reads the ledger, stops short of .invoke(), mutates nothing.

The WaitForWeeklySubstrateHealthCheck → ReportCard → Director wiring is unchanged; on a preflight both states run dry, they are not skipped.

Tests
- Expect the restored direct WaitForWeeklySubstrateHealthCheck → ReportCard edge.
- Added test_advisory_tail_runs_dry_on_preflight (asserts dry_run.$=$.research_dry on both payloads).
- Updated the _SATURDAY_PAYLOAD_KEYS registry for the new key.

Deploy note
SF auto-deploys on merge to main; the merged data #371 (--definition file://) fix clears the prior ARG_MAX limit. Both this + evaluator #23 should land together (the SF references the handler dry_run contract).

🤖 Generated with Claude Code
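For reviewers who want a feel for the structural test named above, a sketch along these lines may help. It is not the test from this PR: the trimmed DEFINITION dict, the "date.$" key, the state types, and the payload_of helper are hypothetical, and the real definition carries many more fields. Only the state names, the dry_run.$ = $.research_dry wiring, and the test name come from the description.

```python
from typing import Any, Dict

EXPECTED_DRY_PATH = "$.research_dry"
ADVISORY_STATES = ("ReportCard", "Director")

# Hypothetical, trimmed-down definition. State types and the "date.$" key are
# assumptions; the real states carry more fields (function name, retries, ...).
DEFINITION: Dict[str, Any] = {
    "States": {
        "WaitForWeeklySubstrateHealthCheck": {"Type": "Pass", "Next": "ReportCard"},
        "ReportCard": {
            "Type": "Task",
            "Parameters": {
                "Payload": {"date.$": "$.date", "dry_run.$": EXPECTED_DRY_PATH}
            },
            "Next": "Director",
        },
        "Director": {
            "Type": "Task",
            "Parameters": {
                "Payload": {"date.$": "$.date", "dry_run.$": EXPECTED_DRY_PATH}
            },
            "End": True,  # placeholder; the real state continues to the notify gate
        },
    }
}


def payload_of(definition: Dict[str, Any], state_name: str) -> Dict[str, Any]:
    """Return the Lambda payload template for a Task state."""
    return definition["States"][state_name]["Parameters"]["Payload"]


def test_advisory_tail_runs_dry_on_preflight() -> None:
    # Both advisory states must read the dry flag from the same input path.
    for name in ADVISORY_STATES:
        assert payload_of(DEFINITION, name).get("dry_run.$") == EXPECTED_DRY_PATH, name


def test_direct_edge_restored() -> None:
    # No Choice gate sits between the health check and ReportCard.
    upstream = DEFINITION["States"]["WaitForWeeklySubstrateHealthCheck"]
    assert upstream["Next"] == "ReportCard"
```

Asserting on the JSONPath string rather than a resolved boolean keeps the test purely structural, so it fails if either payload loses the key or points at a different input field.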